-
-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bump sqlfluff from 1.4.5 to 3.3.0 in /src #3656
base: main
Are you sure you want to change the base?
Conversation
Bumps [sqlfluff](https://github.com/sqlfluff/sqlfluff) from 1.4.5 to 3.0.6. - [Release notes](https://github.com/sqlfluff/sqlfluff/releases) - [Changelog](https://github.com/sqlfluff/sqlfluff/blob/main/CHANGELOG.md) - [Commits](sqlfluff/sqlfluff@1.4.5...3.0.6) --- updated-dependencies: - dependency-name: sqlfluff dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]>
76d96df
to
9ebd94c
Compare
See some SQL formatting example as a result of: There are a few issues in 2022 that require a manual fix though:
Should we maybe just ignore past years? |
@rviscomi wasn't happy moving from this: JOIN (
SELECT
_TABLE_SUFFIX AS client,
COUNT(0) AS total
FROM
`httparchive.summary_pages.2022_06_01_*`
GROUP BY
_TABLE_SUFFIX) to this style which v2 and above insists upon: JOIN (
SELECT
_TABLE_SUFFIX AS client,
COUNT(0) AS total
FROM
`httparchive.summary_pages.2022_06_01_*`
GROUP BY
_TABLE_SUFFIX
) And I've had no luck so far getting the SQLFluff maintainers (of which I'm one btw, but don't do much with it these days) to support Rick's preferred bracketing, even optionally. So that's a sticking point for the upgrade. Other than that we'd need to update the |
Thanks for the context.
This rule is critical for indentation formatting.
|
Yeah the layout code is complex (which is why I haven't gone and fixed it myself!). I think we need to add But to be honest I don't disagree with SQLFluff and think I actually prefer it on a new line (like we do with CTEs). |
@tunetheweb just to clarify, I actually prefer the closing parenthesis to be on its own line. What I said in #3364 (comment) was I wasn't (and still am not) a fan of changing from K&R: FROM (
SELECT
_TABLE_SUFFIX AS client,
hasHeading(payload) AS has_heading
FROM
`httparchive.pages.2019_07_01_*`
) to GNU: FROM
(
SELECT
_TABLE_SUFFIX AS client,
hasHeading(payload) AS has_heading
FROM
`httparchive.pages.2019_07_01_*`
) (Specifically, the extra indentation) |
OK then I think we might be OK then as think that was made optional. Let me try and run fix and see what it looks like. |
…chive/almanac.httparchive.org into dependabot/pip/src/sqlfluff-3.0.6
@rviscomi can you spot check a few files? Will be good to get this upgraded as long been on my list to resolve this issue! |
sql/2019/media/04_09b.sql
Outdated
( | ||
REGEXP_CONTAINS(body, r'(?is)<meta[^><]*Accept-CH\b') OR | ||
REGEXP_CONTAINS(respOtherHeaders, r'(?is)Accept-CH = ') | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use K&R
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you mean move the opening bracket up to previous line?
WHERE
date = '2019-07-01' AND
firstHtml AND (
REGEXP_CONTAINS(body, r'(?is)<meta[^><]*Accept-CH\b') OR
REGEXP_CONTAINS(respOtherHeaders, r'(?is)Accept-CH = ')
)
Personally I find that harder to read. We have three distinct AND
clauses but decide to put the start of the third one at the end of line 2.
To me this is clearer:
WHERE
date = '2019-07-01' AND
firstHtml AND
(
REGEXP_CONTAINS(body, r'(?is)<meta[^><]*Accept-CH\b') OR
REGEXP_CONTAINS(respOtherHeaders, r'(?is)Accept-CH = ')
)
But the linter is fine with either (through prefers the latter, hence why the auto fix moved to that when it had to fix the file for other reasons) so can correct to that if that's the preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh and the wikipedia entry has this (emphasis my own):
A multi-statement block inside a function, however, has its opening brace on the same line as its control clause while the closing brace remains on its own line
So while it makes sense to have this:
while {
...
}
as the {
is controlled by the while
and is part of that clause, I'm less convinced about independent clauses.
Surely you wouldn't do this as arguments to a function that use brackets, when they are spread over separate lines:
myfunction(
a, (
b+c
),
d
) {
But instead would do this?:
myfunction(
a,
(b+c),
d
) {
And that's exactly what we have here: independent clauses separated by an AND
(instead of a comma).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah since (b+c)
can fit on one line that's exactly what I'd do. Otherwise, I'd prefer to be consistent and use K&R.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're saying if they couldn't fit on one line you'd do this:
myfunction(
a, (
b +
c
),
d
) {
That's just weird to me, and not K&R, but OK can do that if we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or should it be like this?
myfunction(
a,
(b +
c),
d
) {
Confused as to what it wanted here?
sql/2019/media/04_09c.sql
Outdated
( | ||
REGEXP_CONTAINS(body, r'(?im)<meta[^><]*Accept-CH\b') OR | ||
REGEXP_CONTAINS(respOtherHeaders, r'(?im)Accept-CH = ') | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
K&R
( | ||
SELECT domain | ||
FROM `httparchive.almanac.third_parties` | ||
WHERE date = '2020-08-01' AND | ||
category != 'hosting') | ||
category != 'hosting' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
K&R
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have some ideas after looking into config update (excluded rules in particular).
- I would re-enable LT09 - Select targets on new lines.
There is a lot of variety in styles across chapters. Adding this would also make multicolumn selects more readable.
Examples re-enabled rule formatting:
- back on same line
SELECT DISTINCT NET.REG_DOMAIN(domain) AS domain
FROM
`httparchive.almanac.cname_tracking`,
UNNEST(SPLIT(SUBSTRING(domains, 2, LENGTH(domains) - 2))) AS domain
SELECT *
FROM pages
- now with newlines
SELECT
client,
url AS site,
jsonToKeyValueArray(events) AS events_per_site
FROM pages_events
- LT08 - Blank line separators for CTEs.
In case of CTEs it's much more readable with separators.
And we currently have most of CTEs with blank lines. Except markup and fonts chapters in 2022.
I would include it.
sql/2020/third-parties/distribution_of_size_and_time_by_third_parties.sql
Show resolved
Hide resolved
While I agree with this for top-level queries, it breaks subqueries like this:
And IIRC we have a lot of those subqueries and separating them all out would make the code very verbose.
This looks OK to enable. We used to have a lot of these: WITH cte1 AS (
SELECT a FROM table1
),
cte2 AS (
SELECT a FROM table2
)
SELECT * FROM cte1 And we didn't want the space between the CTEs but looks like most of them have spaces now and I added when adding the linter! So looks like we can enable this one. |
Looked into LT09 again - what do you mean by 'breaks subqueries'? I've found 4 files that will be definitely broken by this rule enabled, e.g. sql/2019/performance/07_03d.sql. But in all other cases these inline subqueries are as if the query says 'no need to read this one, move on' |
Yeah "breaks" is a bit strong. And it is a lot more readable with it enforced: FROM
`httparchive.pages.2019_07_01_*`
JOIN
- (SELECT _TABLE_SUFFIX, COUNT(0) AS total FROM `httparchive.pages.2019_07_01_*` GROUP BY _TABLE_SUFFIX)
+ (SELECT
+ _TABLE_SUFFIX,
+ COUNT(0) AS total
+ FROM `httparchive.pages.2019_07_01_*` GROUP BY _TABLE_SUFFIX)
USING (_TABLE_SUFFIX),
UNNEST(getElements(payload)) AS element So maybe it is OK to enable this rule? For the few we don't want this enabled for we can ignore it with #standardSQL
# 07_03d: % fast FCP per PSI by geo
WITH geos AS ( -- noqa: disable=LT09
SELECT *, 'af' AS geo_code, 'Afghanistan' AS geo, 'Asia' AS region, 'Southern Asia' AS subregion FROM `chrome-ux-report.country_af.201907`
UNION ALL |
Actually looking over the changes with Examples: FROM
`httparchive.almanac.requests`
JOIN (
- SELECT _TABLE_SUFFIX AS client, url AS page, app
+ SELECT
+ _TABLE_SUFFIX AS client,
+ url AS page,
+ app
FROM `httparchive.technologies.2022_06_01_*`
WHERE category = 'Ecommerce'
) #standardSQL
# Top JS frameworks and libraries combinations
-SELECT
- *
+SELECT *
FROM (
SELECT
client, third_party_domains AS (
- SELECT DISTINCT
- NET.HOST(domain) AS host
+ SELECT DISTINCT NET.HOST(domain) AS host
FROM
`httparchive.almanac.third_parties`
), In fact in there's very few cases where I do think it improves things. Maybe this one cause it only has a FROM
- (SELECT _TABLE_SUFFIX AS client, url AS page, getCompliantElements(payload) AS compliant_elements FROM `httparchive.pages.2019_07_01_*`)
+ (SELECT
+ _TABLE_SUFFIX AS client,
+ url AS page,
+ getCompliantElements(payload) AS compliant_elements
+ FROM `httparchive.pages.2019_07_01_*`)
JOIN |
Regarding indentation discussion, here's a config to get to the expected result:
If seems OK, I'll push the changes and we'll have a fresh look. And while I agree with @tunetheweb that inline opening bracket seems strange in expressions, So we'll get: sql/2019/media/04_09c.sql b/sql/2019/media/04_09c.sql
- `httparchive.almanac.summary_response_bodies`
- WHERE
- date = '2019-07-01' AND
- firstHtml AND
- (
- REGEXP_CONTAINS(body, r'(?im)<meta[^><]*Accept-CH\b') OR
- REGEXP_CONTAINS(respOtherHeaders, r'(?im)Accept-CH = ')
- )
+ `httparchive.almanac.summary_response_bodies`
+ WHERE
+ date = '2019-07-01' AND
+ firstHtml AND (
+ REGEXP_CONTAINS(body, r'(?im)<meta[^><]*Accept-CH\b') OR
+ REGEXP_CONTAINS(respOtherHeaders, r'(?im)Accept-CH = ')
+ ) Some more examples (a lot of things go inline): 2019/accessibility/09_02.sql b/sql/2019/accessibility/09_02.sql
-FROM
- (SELECT _TABLE_SUFFIX AS client, url AS page, getMainCount(payload) AS main_elements FROM `httparchive.pages.2019_07_01_*`)
-JOIN
- (SELECT client, page, ARRAY_LENGTH(REGEXP_EXTRACT_ALL(body, '(?i)role=[\'"]?main')) AS main_roles FROM `httparchive.almanac.summary_response_bodies` WHERE date = '2019-07-01' AND firstHtml)
-USING
- (client, page)
-JOIN
- (SELECT _TABLE_SUFFIX AS client, COUNT(0) AS total FROM `httparchive.pages.2019_07_01_*` GROUP BY _TABLE_SUFFIX)
+FROM (SELECT _TABLE_SUFFIX AS client, url AS page, getMainCount(payload) AS main_elements FROM `httparchive.pages.2019_07_01_*`)
+JOIN (SELECT client, page, ARRAY_LENGTH(REGEXP_EXTRACT_ALL(body, '(?i)role=[\'"]?main')) AS main_roles FROM `httparchive.almanac.summary_response_bodies` WHERE date = '2019-07-01' AND firstHtml)
+USING (client, page)
+JOIN (SELECT _TABLE_SUFFIX AS client, COUNT(0) AS total FROM `httparchive.pages.2019_07_01_*` GROUP BY _TABLE_SUFFIX) 2020/markup/pages_element_count_by_device_and_custom_dash_elements.sql
-CREATE TEMP FUNCTION AS_PERCENT(freq FLOAT64, total FLOAT64) RETURNS FLOAT64 AS (
- ROUND(SAFE_DIVIDE(freq, total), 4)
+CREATE TEMP FUNCTION AS_PERCENT(freq FLOAT64, total FLOAT64) RETURNS FLOAT64 AS (ROUND(SAFE_DIVIDE(freq, total), 4) 2019/performance/07_22.sql
SELECT
- _TABLE_SUFFIX AS client,
- (
+ _TABLE_SUFFIX AS client, (
CAST(IFNULL(JSON_EXTRACT(payload, "$['_cpu.Paint']"), '0') AS INT64) +
CAST(IFNULL(JSON_EXTRACT(payload, "$['_cpu.UpdateLayerTree']"), '0') AS INT64)
) AS paint_cpu_time 2020/markup/pages_markup_by_device.sql
-FROM
- (
- SELECT
- _TABLE_SUFFIX AS client,
- get_markup_info(JSON_EXTRACT_SCALAR(payload, '$._markup')) AS markup_info
- FROM
- `httparchive.pages.2020_08_01_*`
- )
+FROM (
+ SELECT
+ _TABLE_SUFFIX AS client,
+ get_markup_info(JSON_EXTRACT_SCALAR(payload, '$._markup')) AS markup_info
+ FROM
+ `httparchive.pages.2020_08_01_*`
+) |
A newer version of sqlfluff exists, but since this PR has been edited by someone other than Dependabot I haven't updated it. You'll get a PR for the updated version as normal once this PR is merged. |
Pushed all the file updates corresponding to new indents. |
Bumps sqlfluff from 1.4.5 to 3.0.6.
Release notes
Sourced from sqlfluff's releases.
... (truncated)
Changelog
Sourced from sqlfluff's changelog.
... (truncated)
Commits
75cf648
Prep version 3.0.6 (#5853)4838e89
[fix_clickhouse] Temporary Table Create AS SELECT (#5843)99b09d4
Bugfix: ST02 - Compare entire condition expression (#5850)49d48a5
Clichouse prewhere (#5849)d48ded5
BigQuery: Support missing DROP statements (#5848)019922a
BigQuery: various CREATE statements (#5846)4f6760e
BigQuery Alter Schema (#5835)d2e448a
Snowflake execute immediate from (#5836)b326066
Support dbt 1.7 (#5842)baceed9
Postgres: Create extension cascade (#5834)You can trigger a rebase of this PR by commenting
@dependabot rebase
.Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
@dependabot rebase
will rebase this PR@dependabot recreate
will recreate this PR, overwriting any edits that have been made to it@dependabot merge
will merge this PR after your CI passes on it@dependabot squash and merge
will squash and merge this PR after your CI passes on it@dependabot cancel merge
will cancel a previously requested merge and block automerging@dependabot reopen
will reopen this PR if it is closed@dependabot close
will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually@dependabot show <dependency name> ignore conditions
will show all of the ignore conditions of the specified dependency@dependabot ignore this major version
will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)@dependabot ignore this minor version
will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)@dependabot ignore this dependency
will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)